Audio processing for improved user experience

ABSTRACT

Methods and systems that facilitate enhanced media capabilities for electronic devices. The enhanced media capabilities enable electronic devices to provide voice calling with concurrent audio playback. The audio playback can originate at the electronic device itself or can be transmitted to the electronic device as part of or together with the voice calling. In addition, the enhanced media capabilities can also provide users of electronic devices with acoustic separation (e.g., spatial positioning) of audio currently provided from a voice call and from audio playback. Still further, the enhanced media capabilities can also provide users of electronic devices with acoustic separation (e.g., spatial positioning) of participants in multi-party calls.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional of prior, co-pending, U.S. patent application Ser. No. 11/525,670, filed on Sep. 21, 2006, and entitled AUDIO PROCESSING FOR IMPROVED USER EXPERIENCE, which is incorporated herein by reference in its entirety for all purposes.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to electronic devices and, more particularly, to enhanced audio processing for electronic devices.

2. Description of the Related Art

Portable electronic devices, such as MP3 players and Personal Digital Assistants (PDAs), support media playback for users of such devices. Traditionally, other portable electronic devices, such as mobile phones, have not offered media playback. Recently, however, mobile telephones have included the functionality of MP3 players so that music can be played back for users of the mobile telephones. Unfortunately, however, when a call is incoming to a mobile telephone, music that might otherwise be played is stopped. After the call has ended, the user can be prompted to return to the music playback if so desired. Hence, music cannot be played while using the mobile telephone to engage in a voice call.

Additionally, mobile telephones and computers (e.g., Internet telephony) can enable users to engage in voice calls with multiple parties. However, one problem with such conventional devices is that a user can easily get confused as to who is talking when there are multiple participants on a voice call.

Thus, there is a need to facilitate improved audio capabilities for electronic devices.

SUMMARY OF THE INVENTION

The invention pertains to improved methods and systems that facilitate enhanced media capabilities for electronic devices. The enhanced media capabilities enable electronic devices to provide voice calling with concurrent audio playback. The audio playback can originate at the electronic device itself or can be transmitted to the electronic device as part of or together with the voice calling. In addition, the enhanced media capabilities can also provide users of electronic devices with acoustic separation (e.g., spatial positioning) of audio currently provided from a voice call and from audio playback. Still further, the enhanced media capabilities can also provide users of electronic devices with acoustic separation (e.g., spatial positioning) of participants in multi-party calls.

The invention can be implemented in numerous ways, including as a method, system, device, apparatus (including graphical user interface), or computer readable medium. Several embodiments of the invention are discussed below.

As a method for handling an incoming call at a portable electronic device having wireless communication support as well as media playback support, one embodiment of the invention includes at least: receiving an incoming voice call from a calling party; determining whether audio playback of a media item is provided while receiving the incoming voice call; and controlling audio output so that a user of the portable electronic device can hear not only the incoming voice call but also the audio playback of the media item when it is determined that the audio playback is provided while receiving the incoming voice call.

As a method for operating a portable electronic device having a media playback subsystem, a wireless communication subsystem and first and second speakers, one embodiment of the invention includes at least the acts of: playing back a media item by the media playback subsystem using first and second audio output channels respectively provided to the first and second speakers; receiving an incoming communication call to the wireless communication subsystem; altering the playing back of the media item while the incoming communication call is being received so as to provide a mono audio output channel to the second speaker and no audio output channel to the first speaker; and outputting the incoming call by providing a communication channel to the first speaker while the incoming communication call is being received.

As a method for operating a portable electronic device having a media playback subsystem, a wireless communication subsystem and first and second speakers, one embodiment of the invention includes at least the acts of: receiving audio for a media item being played back by the media playback subsystem; receiving an incoming communication call to the wireless communication subsystem; altering the audio pertaining to the incoming communication call and audio from the media item being played back so as to appear to be originating from different virtual positions; and producing a resulting audio by supplying to the first and second speakers both the altered audio for the incoming communication call and the altered audio for the media item being played back.

As a portable electronic device, one embodiment of the invention includes at least: an audio playback subsystem that plays back one or more stored media items; a communication subsystem that supports a voice call; and an audio manager that operates to determine whether audio playback of a stored media item is to be provided while engaging in a voice call, and to direct audio output so that a user of the portable electronic device can hear not only the voice call but also the audio playback of the stored media item when it is determined that the audio playback is provided while engaging in the voice call.

As a method for providing a multi-party call on a communication device having associated therewith at least two speakers available for audio output, the multi-party call being with a user of the communication device and a plurality of other participants, one embodiment of the invention includes at least the acts of: assigning the participants to virtual positions; receiving call audio from the participants during the multi-party call; adapting the call audio by the participants based on the virtual positions corresponding thereto; and presenting the adapted call audio to the at least two speakers associated with the communication device.

As a graphical user interface for use in managing virtual locations for a plurality of participants to a multi-party call, one embodiment of the invention includes at least a plurality of visually distinct regions, and a visual indication for at least a plurality of the participants. The visual indication for at least one of the participants can be assigned to a different one of the visually distinct regions, thereby causing an audio sound associated with the participant to be spatially adapted to originate from a virtual location corresponding to the visually distinct region.

As a portable communication device having at least two speakers available for audio output, one embodiment of the invention includes at least: a communication subsystem that supports a multi-party call, the multi-party call being between a user of the portable communication device and a plurality of other participants; and an audio manager that operates to assign the participants to virtual positions, receive call audio from the participants during the multi-party call, adapt the call audio by the participants based on the virtual positions corresponding thereto, and send the adapted call audio to the at least two speakers.

Other aspects and advantages of the invention will become apparent from the following detailed description taken in conjunction with the accompanying drawings which illustrate, by way of example, the principles of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be readily understood by the following detailed description in conjunction with the accompanying drawings, wherein like reference numerals designate like structural elements, and in which:

FIG. 1 is a block diagram of a wireless audio system according to one embodiment of the invention.

FIG. 2 is a block diagram of a media player suitable for use with the invention.

FIG. 3A is a flow diagram of a call reception process according to one embodiment of the invention.

FIG. 3B is a flow diagram of a call termination process according to one embodiment of the invention.

FIGS. 4A-4C are flow diagrams of an audio management process according to one embodiment of the invention.

FIG. 5 is a table illustrating representative audio management for a portable electronic device supporting both audio playback and wireless voice communications according to one embodiment of the invention.

FIG. 6 is an exemplary display screen suitable for use on a portable electronic device according to one embodiment of the invention.

FIG. 7 is a diagram of a multi-party conference system according to one embodiment of the invention.

FIG. 8A is a flow diagram of a spatial conference process according to one embodiment of the invention.

FIG. 8B is a flow diagram of a spatial conference process according to another embodiment of the invention.

FIG. 9 is a diagram of a virtual space for a multi-party conference call according to one embodiment of the invention.

FIG. 10A is a representation of a conference call screen according to one embodiment of the invention.

FIG. 10B is a diagram of an exemplary representation of a multi-party participant position screen according to one embodiment of the invention.

FIG. 10C is a diagram of a multi-party participant position screen including participant information according to one embodiment of the invention.

FIG. 10D is a diagram of another exemplary representation of a multi-party participant position screen according to one embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

The invention pertains to improved methods and systems that facilitate enhanced media capabilities for electronic devices. The enhanced media capabilities enable electronic devices to provide voice calling with concurrent audio playback. The audio playback can originate at the electronic device itself or can be transmitted to the electronic device as part of or together with the voice calling. In addition, the enhanced media capabilities can also provide users of electronic devices with acoustic separation (e.g., spatial positioning) of audio currently provided from a voice call and from audio playback. Still further, the enhanced media capabilities can also provide users of electronic devices with acoustic separation (e.g., spatial positioning) of participants in multi-party calls.

“Media data,” as used herein, is digital data that pertains to at least one of audio, video, and image. Some examples of specific forms of media data (which can be referred to as “media items”) include, but are not limited to, songs, albums, audiobooks, playlists, movies, music videos, photos, computer games, podcasts, audio and/or video presentations, news reports, and sports updates.

Embodiments of the invention are discussed below with reference to FIGS. 1-10D. However, those skilled in the art will readily appreciate that the detailed description given herein with respect to these figures is for explanatory purposes as the invention extends beyond these limited embodiments.

Various aspects, embodiments and implementations of media utilization are described below. These aspects, embodiments and implementations can be utilized separately or in any combination.

One aspect of the invention pertains to a wireless system that supports both wireless communications and media playback. The wireless communications and the media playback can be concurrently supported. Consequently, a user is able to not only participate in a voice call but also hear audio playback at the same time.

FIG. 1 is a block diagram of a wireless system 100 according to one embodiment of the invention. The wireless system 100 includes one or more portable electronic devices. In particular, portable electronic devices 102 and 104 are illustrated in FIG. 1. The wireless system 100 supports one or more communication devices, such as the communication device 106 illustrated in FIG. 1. As an example, the communication device 106 can be a mobile telephone.

The portable electronic device 102 includes an audio subsystem 110 and a communication subsystem 112. The portable electronic device 102 can store audio items that can be played by the audio subsystem 110. A user of the portable electronic device 102 can utilize a headset 114 that can couple (wired or wirelessly) via a link 116 to the portable electronic device 102. The headset 114 typically has a left speaker 118 and a right speaker 120. Through use of the headset 114, the user of the portable electronic device 102 is able to hear audio items played by the audio subsystem 110 as well as voice calls being received by the communication subsystem 112. The manner by which the portable electronic device 102 facilitates delivery of audio playback of media items as well as audio for voice calls is described in further detail below.

The portable electronic device 104 includes an audio subsystem 122 and a communication subsystem 124. The audio subsystem 122 allows the portable electronic device 104 to play back audio items. The communication subsystem 124 allows the portable electronic device 104 to participate in voice calls through a wireless network 108. The portable electronic device 104 enables a user to hear audio output from either or both of the audio subsystem 122 and the communication subsystem 124 at speakers 126 and 128. The speakers 126 and 128 can correspond to left and right speakers, respectively. The speakers 126 and 128 can also be referred to as earphones. Again, the manner by which the portable electronic device 104 manages the output of audio from the audio subsystem 122 and the communication subsystem 124 is discussed in greater detail below.

The portable electronic devices 102 and 104 support not only media playback but also wireless communications. As one example, the portable electronic devices 102 and 104 can correspond to mobile telephones that include audio capabilities. As another example, the portable electronic devices 102 and 104 can pertain to media playback devices (e.g., MP3 players) that include communication capabilities. As still another example, the portable electronic devices 102 and 104 can pertain to a personal digital assistant that includes media playback as well as communication capabilities.

In one embodiment, the form factor for the portable electronic devices 102 and 104 (as well as the communication device 106) can be hand-held (or palm-sized) or pocket-sized. In one embodiment, the form factor of the portable electronic devices is hand-held or smaller. The portable electronic devices may, for example, be small and lightweight enough to be carried in one hand, worn, or placed in a pocket (of a user's clothing). Although the form factor is generally small and hand-held (or palm-sized), the configuration of the device can vary widely.

FIG. 2 is a block diagram of a media device 200 suitable for use with the invention. The media device 200 can illustrate representative circuitry of the portable electronic devices 102 and 104 in FIG. 1.

The media device 200 includes a processor 202 that pertains to a microprocessor or controller for controlling the overall operation of the media device 200. The media device 200 stores media data pertaining to media items in a file system 204 and a cache 206. The file system 204 can be implemented by semiconductor memory (e.g., EEPROM, Flash, etc.) or by at least one storage disk. The file system 204 typically provides high capacity storage capability for the media device 200. However, since the access time to the file system 204 is relatively slow, the media device 200 can also include a cache 206. The cache 206 is, for example, Random-Access Memory (RAM) provided by semiconductor memory. The relative access time to the cache 206 is substantially shorter than for the file system 204. However, the cache 206 does not have the large storage capacity of the file system 204. Further, the file system 204, when active, consumes more power than does the cache 206. The power consumption is often a concern when the media device 200 is a portable media player that is powered by a battery (not shown). The media device 200 also includes a RAM 220 and a Read-Only Memory (ROM) 222. The ROM 222 can store programs, utilities or processes to be executed in a non-volatile manner. The RAM 220 provides volatile data storage, such as for the cache 206.

The media device 200 also includes a user input device 208 that allows a user of the media device 200 to interact with the media device 200. For example, the user input device 208 can take a variety of forms, such as a button, keypad, touchpad, dial, etc. Still further, the media device 200 includes a display 210 (screen display) that can be controlled by the processor 202 to display information to the user. A data bus 211 can facilitate data transfer between at least the file system 204, the cache 206, the processor 202, and the CODEC 212.

In one embodiment, the media device 200 serves to store a plurality of media items (e.g., songs) in the file system 204. When a user desires to have the media device 200 play a particular media item, a list of available media items can be displayed on the display 210. Then, using the user input device 208, a user can select one of the available media items. The processor 202, upon receiving a selection of a particular media item, supplies the media data (e.g., audio file) for the particular media item to a coder/decoder (CODEC) 212. The CODEC 212 then produces analog output signals for a speaker 214. The speaker 214 can be a speaker internal to the media device 200 or external to the media device 200. For example, headphones or earphones that connect to the media device 200 would be considered an external speaker.

The media device 200 also includes a bus interface 216 that couples to a data link 218. The data link 218 allows the media device 200 to couple to a host device (e.g., host computer or power source). The data link 218 can also provide power to the media device 200.

The media device 200 further includes a wireless communication interface 226 and an antenna 228 to support wireless communication. The wireless communication can pertain to voice or data communications. A microphone 230 can provide voice pickup for an outgoing voice communication. The processor 202 can also operate to control communications (incoming or outgoing) via the wireless communication interface 226. In one embodiment, the processor 202 can execute computer code to effectively operate as an audio manager, a communication manager, a data manager, and a user interface manager.

FIG. 3A is a flow diagram of a call reception process 300 according to one embodiment of the invention. The call reception process 300 is, for example, performed by a portable electronic device, such as the portable electronic devices 102 and 104 illustrated in FIG. 1. These portable electronic devices 102 and 104 support media playback capabilities as well as communication capabilities.

The call reception process 300 begins with a decision 302 that determines whether a call is incoming. The incoming call is typically a voice call provided over a wireless communication network. When the decision 302 determines that a call is not incoming, the call reception process 300 awaits an incoming call. On the other hand, when the decision 302 determines that a call is incoming, the call reception process 300 continues. In other words, the call reception process 300 can be deemed to be invoked when a call is incoming.

In any case, once a call is incoming, a decision 304 determines whether media playback is active. When the decision 304 determines that media playback is active, media playback is altered 306. Typically, in this embodiment, the media playback concerns playback of a media item that is already ongoing when the incoming call arrives. The altering 306 of the media playback can be implemented in a variety of different ways. In one implementation, the media playback is modified but not stopped during the incoming call. As one example, the media playback can be directed to one output audio channel, with the incoming call being directed to another output audio channel. Such an approach will allow the user of the portable electronic device to continue to hear the media playback while also hearing the incoming call. As another example, the media playback could be mixed with the incoming call and provided to the user of the portable electronic device as a combined output audio channel. Alternatively, when the decision 304 determines that media playback is not active, the block 306 is bypassed since there is no media playback to be altered.

Following the block 306 or its being bypassed, the call reception process 300 outputs 308 the incoming call to one or more audio output devices. As an example, the audio output devices can correspond to speakers. In one implementation, the speakers can be provided on or within a housing of the portable electronic device. In another implementation, the speakers can be external speakers associated with earphones or a headset. Following the block 308, the call reception process 300 ends.
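
By way of illustration only, the decision 304 and the blocks 306 and 308 can be sketched in Python as follows. The sketch assumes the channel-splitting implementation of the altering 306 rather than the mixing implementation. The names AudioRouting and route_incoming_call are hypothetical, and directing the call to both channels when no media playback is active is an assumption, not something FIG. 3A requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioRouting:
    """Source feeding each output channel; None means the channel is silent."""
    left: Optional[str] = None
    right: Optional[str] = None


def route_incoming_call(playback_active: bool) -> AudioRouting:
    """Hypothetical sketch of decision 304 and blocks 306 and 308."""
    if playback_active:
        # Block 306: alter (but do not stop) the media playback by confining
        # it to one channel. Block 308: output the call on the other channel.
        return AudioRouting(left="media", right="call")
    # Block 306 is bypassed. Block 308: output the call on both channels
    # (an assumption; the text only requires one or more output devices).
    return AudioRouting(left="call", right="call")
```

In this sketch the user continues to hear the media item on one channel while the call occupies the other, which corresponds to the first example of the altering 306 given above.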

FIG. 3B is a flow diagram of a call termination process 350 according to one embodiment of the invention. The call termination process 350 is performed by a portable electronic device, such as the portable electronic devices 102 and 104 illustrated in FIG. 1. The call termination process 350 can be considered counterpart processing to the call reception process 300 illustrated in FIG. 3A.

The call termination process 350 begins with a decision 352 that determines whether a call has been concluded. When the decision 352 determines that a call has not yet been concluded, then the call termination process 350 awaits termination of the call. On the other hand, when the decision 352 determines that the call has concluded, then the call termination process 350 continues. In other words, the call termination process 350 is performed when a call terminates.

Once the decision 352 determines that a call has concluded, outputting of the call to the one or more audio output devices is stopped 354. A decision 356 then determines whether media playback is active. Here, if media playback was active when the incoming call was received, media playback will typically remain active when the call concludes. Hence, when the decision 356 determines that media playback is active (when the call concludes), the call termination process 350 un-alters 358 the media playback. Since the call reception process 300 altered 306 the media playback when the incoming call arrived, when the call concludes the media playback is un-altered 358. As a result, the media playback is thereafter able to be output in the same manner that it was output before the incoming call. Alternatively, when the decision 356 determines that media playback is not active, then the block 358 is bypassed because no media playback is to be provided. Following the block 358 or its being bypassed, the call termination process 350 ends.
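
As a minimal sketch under stated assumptions, the blocks 354 and 358 and the decision 356 might be expressed in Python as shown below. The function name conclude_call, the dictionary-based routing, and the idea of remembering the routing that was in effect before the call are all hypothetical; the text only requires that playback afterwards be output in the same manner as before the call.

```python
from typing import Dict, Optional

# Maps a channel name to the source it carries; None means silent.
Routing = Dict[str, Optional[str]]


def conclude_call(current: Routing, before_call: Routing,
                  playback_active: bool) -> Routing:
    """Hypothetical sketch of blocks 354 and 358 and decision 356."""
    # Block 354: stop outputting the call on every channel that carried it.
    routing = {ch: (None if src == "call" else src)
               for ch, src in current.items()}
    if playback_active:
        # Block 358: un-alter the media playback by restoring the routing
        # that was in effect before the incoming call.
        routing = dict(before_call)
    return routing
```

When playback is not active, the block 358 is bypassed and the channels that carried the call are simply left silent.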

Although the call reception process 300 illustrated in FIG. 3A and the call termination process 350 illustrated in FIG. 3B pertain to processes that alter or un-alter media playback dependent on the presence of a voice call, it should be understood that similar processing can be performed in other scenarios. As another embodiment, if media playback is initiated when an incoming call is active, then the output of the incoming call can be altered so that both the incoming call and the media playback can be directed to the audio output device(s). When the media playback is no longer active, any prior altering of the output of the incoming call can be un-altered.

FIGS. 4A-4C are flow diagrams of an audio management process 400 according to one embodiment of the invention. The audio management process 400 is, for example, performed by a portable electronic device. Examples of portable electronic devices include the portable electronic devices 102 and 104 illustrated in FIG. 1.

The audio management process 400 begins with a decision 402 that determines whether an audio start request has been received. Here, a user of the portable electronic device can provide a user input to invoke an audio start request. When the decision 402 determines that an audio start request has been received, audio to be played is identified 404. User input with respect to the portable electronic device can be used to identify 404 the audio to be played. After the audio to be played has been identified 404, playback of the identified audio is initiated 406. As an example, the playback of the identified audio can be performed by the audio subsystem 110 of the portable electronic device 102 illustrated in FIG. 1.

Following the block 406, as well as directly following the decision 402 when an audio start request is not received, a decision 408 determines whether an audio stop/pause request has been received. The audio stop/pause request can be initiated by user input with respect to the portable electronic device. When the decision 408 determines that an audio stop/pause request has been received, playback of the identified audio is stopped or paused 410.

Following the block 410, as well as directly following the decision 408 when an audio stop/pause request has not been received, a decision 412 determines whether an incoming call has been answered. When the decision 412 determines that an incoming call has been answered, a decision 414 determines whether audio playback is active. When the decision 414 determines that audio playback is not active, the call audio (i.e., audio for the call) is directed 416 to left and right channels. The left and right channels can, for example, correspond to left and right speakers. Alternatively, when the decision 414 determines that audio playback is active, audio playback is directed 418 to a left channel and the call audio is directed 420 to a right channel.

Following the blocks 416 or 420, as well as directly following the decision 412 when an incoming call is not answered, a decision 422 determines whether a channel control action has been received. A channel control action can be associated with a user input that impacts channel assignments or properties. Hence, when the decision 422 determines that a channel control action has been received, channel assignments or properties are altered 424. For example, the channel assignments can be altered 424 by a toggling action that switches different audio channels to different speakers. The channel properties can be altered 424 by adjusting the blending or mixing of different audio channels before being output to a speaker.

Following the block 424, as well as following the decision 422 when a channel control action is not received, a decision 426 determines whether a call has concluded. When the decision 426 determines that a call has concluded, a decision 428 determines whether audio playback is active. Audio playback can be deemed active if the audio playback is active when the call concludes or can be deemed active if audio playback was active when a call was received. When the decision 428 determines that the audio playback is active, audio playback can be directed 430 to left and right channels. Previously, during the call, the audio playback was directed 418 to the left channel and not to the right channel because the right channel carried the call audio. Now, since the call has concluded, the audio playback can be again directed 430 to both the left and right channels. With both left and right channels being available for audio playback, the audio playback can be provided in stereo. Alternatively, when the decision 428 determines that audio playback is not active, left and right channels can be disabled 432. Following the blocks 430 or 432, as well as directly following the decision 426 when the call has not concluded, the audio management process 400 can return to repeat the decision 402 and subsequent blocks so that subsequent requests can be similarly processed.
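
The audio management process 400 can be pictured, purely as an illustrative sketch, as a small event-driven state holder in Python. The class name AudioManager and its method names are hypothetical. The sketch recomputes the channel assignment from two flags, so transitions that the flow diagrams do not spell out (for example, stopping playback in the middle of a call) are resolved by assumption, and only the toggling form of the channel control action is modeled.

```python
from typing import Optional, Tuple


class AudioManager:
    """Hypothetical event-driven rendering of audio management process 400."""

    def __init__(self) -> None:
        self.playback_active = False
        self.call_active = False
        self.swapped = False  # set by the toggling channel control action

    def start_audio(self) -> None:
        # Blocks 404 and 406: audio identified and playback initiated.
        self.playback_active = True

    def stop_audio(self) -> None:
        # Block 410: playback stopped or paused.
        self.playback_active = False

    def answer_call(self) -> None:
        # Decision 412: an incoming call has been answered.
        self.call_active = True

    def channel_control(self) -> None:
        # Block 424: toggle which channel carries which source.
        self.swapped = not self.swapped

    def end_call(self) -> None:
        # Decision 426: the call has concluded; clear any toggling.
        self.call_active = False
        self.swapped = False

    def channels(self) -> Tuple[Optional[str], Optional[str]]:
        """Return the sources on the (left, right) channels."""
        if self.playback_active and self.call_active:
            pair = ("playback", "call")       # blocks 418 and 420
            return (pair[1], pair[0]) if self.swapped else pair
        if self.call_active:
            return ("call", "call")           # block 416
        if self.playback_active:
            return ("playback", "playback")   # block 430 (stereo)
        return (None, None)                   # block 432 (disabled)
```

For example, starting playback, answering a call, and then ending the call moves the channels from ("playback", "playback") to ("playback", "call") and back again.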

FIG. 5 is a table 500 illustrating representative audio management for a portable electronic device supporting both audio playback and wireless voice communications according to one embodiment of the invention. The portable electronic device has a left audio channel and a right audio channel that are able to carry audio signals to left and right speakers, respectively. Hence, the assignment of audio sources for the left and right channels determines what audio content or information is provided to a user of the portable electronic device via the left and right speakers. Four different audio management scenarios are illustrated in the table 500. In the first scenario, audio playback is active but a voice call is inactive. In this case, the media playback is provided to both left and right channels in a stereo fashion. In the second scenario, audio playback is not active but a voice call is active. In this case, the voice call can be output to one or both of the left and right channels. In the third scenario, audio playback as well as a voice call are active. In this case, the audio playback is provided in a mono fashion to the left channel and the voice call is provided to the right channel. Typically, in this situation, a user may interact with the portable electronic device to alter the channel assignments. For example, by pressing a switch or other input means, the user can cause the audio playback at the left channel to stop and instead provide the voice call to both the left and right channels. In the fourth scenario, the audio playback and voice call are both active. In this case, the audio output to the left and right channels can be a mixture of the audio provided by audio playback and the audio provided by the voice call. A user input action can enable a user to alter the characteristics of the audio mixture. For example, a user input could pause or stop the audio playback. As another example, a user input action could enable a user to alter the relative mixture of the voice call and the audio playback.
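
The third scenario of the table 500 requires stereo playback to be reduced to a single mono channel. One possible way to do so, offered only as a sketch, is shown below in Python. Averaging the left and right samples is an assumed downmix method that the text does not specify, and the function names are hypothetical.

```python
from typing import List, Tuple


def downmix_to_mono(stereo: List[Tuple[float, float]]) -> List[float]:
    """Average each (left, right) sample pair into one mono sample."""
    return [(left + right) / 2.0 for left, right in stereo]


def third_scenario_output(stereo_media: List[Tuple[float, float]],
                          call: List[float]) -> List[Tuple[float, float]]:
    """Mono playback on the left channel, voice call on the right channel."""
    mono = downmix_to_mono(stereo_media)
    # zip stops at the shorter input, so output length follows the shorter one.
    return list(zip(mono, call))
```

Each output pair holds the left-channel sample (mono playback) followed by the right-channel sample (voice call).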

FIG. 6 is an exemplary display screen 600 suitable for use on a portable electronic device according to one embodiment of the invention. The display screen 600 can be presented by a display device associated with the portable electronic device. The display screen 600 includes a blend control 602. The blend control 602 allows a user of the portable electronic device to alter the blend (or mixture) of audio from audio playback and audio from a voice call. The blend control 602 is particularly useful for the fourth scenario discussed above with reference to FIG. 5. The blend control 602 includes a slider 604 that can be manipulated by a user towards either an audio end 606 or a call end 608. As the slider 604 is moved towards the audio end 606, the audio playback output gets proportionately greater than the voice call output. On the other hand, when the slider 604 is moved towards the call end 608, the voice call output gets proportionately greater than the audio playback output. For example, the position of the slider 604 can represent a mixture of the audio playback output and the voice call output with each amplified similarly so that the mixture is approximately 50% audio playback and 50% voice call.
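
The effect of the blend control 602 can be illustrated with a short Python sketch. The linear crossfade used here is an assumption, since the text does not state how slider position maps to gain, and the function name blend and the 0.0 to 1.0 position scale are hypothetical.

```python
from typing import List


def blend(audio: List[float], call: List[float],
          position: float) -> List[float]:
    """Mix playback and call samples according to a slider position.

    position 0.0 corresponds to the audio end 606 and position 1.0 to the
    call end 608 (an assumed scale); 0.5 weights both sources equally.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    audio_gain = 1.0 - position
    call_gain = position
    return [audio_gain * a + call_gain * c for a, c in zip(audio, call)]
```

With the slider at 0.5, each source contributes half of the output, matching the roughly even mixture described above.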

Alternatively, instead of using different audio channels, the audio to be concurrently output from an incoming call and media playback can be altered to provide acoustic separation. The audio for each can be altered such that the audio from the incoming call and the audio from the media playback are perceived by a listener (when output to a pair of speakers, either internal or external) as originating from different virtual locations. The different virtual locations can be default positions or user-specified (during playback or in advance). Additional details on establishing or setting virtual location are discussed below.
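
The text does not state how the audio is altered to achieve different virtual locations. As one simple stand-in, the following Python sketch uses constant-power stereo panning, which places each source at a left-to-right position by weighting its contribution to the two speakers. This is an assumed technique chosen for brevity; a fuller spatialization could involve other processing. The function names and the -1.0 to +1.0 position scale are hypothetical.

```python
import math
from typing import List, Tuple


def pan_gains(position: float) -> Tuple[float, float]:
    """Constant-power (left, right) gains for a position in [-1.0, 1.0].

    -1.0 is fully left, 0.0 is centered, and +1.0 is fully right.
    """
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def place_sources(call: List[float], media: List[float],
                  call_pos: float = -0.5,
                  media_pos: float = 0.5) -> List[Tuple[float, float]]:
    """Pan the call and the media to different positions and sum them."""
    call_l, call_r = pan_gains(call_pos)
    media_l, media_r = pan_gains(media_pos)
    return [(call_l * c + media_l * m, call_r * c + media_r * m)
            for c, m in zip(call, media)]
```

The default positions stand in for the default virtual locations mentioned above, and passing other values corresponds to user-specified locations.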

Another aspect of the invention pertains to transmitting media data from one electronic device to another electronic device while engaging in a voice call between the electronic devices.

In one embodiment, an audio subsystem on an electronic device can control audio output device(s) to produce audio sounds pertaining to a media item. The audio sounds can be directed to a user of the portable electronic device by way of the audio output device(s) (e.g., speaker(s)) within or attached to the electronic device. An attached speaker can be in an earphone or a headset. In addition, the audio sound generated at one portable electronic device can be directed to another electronic device together with audio associated with a voice call. Here, audio for the voice call can be mixed with the audio for the media playback and then transmitted to the other electronic device. The mixed audio can then be output to one or more audio output device(s) (e.g., speakers) associated with the other electronic device. In one implementation, instead of being mixed, the voice call and the media playback can be transmitted using separate channels. In such a case, the other electronic device can play the audio for the voice call and the media playback using separate speakers if desired. Also, in such a case, a user of the other electronic device is able to separately control the volume of the different audio channels. As an alternative, predetermined sound effects, which can also be considered media items, can be likewise directed to other portable electronic devices during a voice call.

The sender or recipient of the audio sounds pertaining to a media item can be permitted to separately control the volume or amplitude of the audio sounds pertaining to the media item. As a result, the mixture or blend of the audio sounds pertaining to media items as compared to audio sounds pertaining to the voice call can be individually or relatively controlled.
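
The two transmission options described above (mixing before transmission, or sending the voice call and the media playback on separate channels) together with a sender-side media volume can be sketched as follows. This is a hedged illustration only; the function name `package_for_transmission`, the dictionary keys, and the `media_gain` parameter are hypothetical and do not correspond to any particular protocol.

```python
def package_for_transmission(voice, media, media_gain=1.0, separate_channels=False):
    """Prepare voice-call and media samples for sending to the other device.

    media_gain models the sender's separate volume control for the media item.
    """
    scaled_media = [media_gain * m for m in media]
    if separate_channels:
        # Separate channels: the recipient can route and scale each one.
        return {"voice": list(voice), "media": scaled_media}
    # Mixed: a single stream combining the call and the media playback.
    return {"mixed": [v + m for v, m in zip(voice, scaled_media)]}
```

When separate channels are used, the recipient retains the ability to adjust each channel's volume independently, which is not possible once the streams have been mixed.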

Still another aspect of the invention pertains to a multi-party communication environment. The various parties to a multi-party communication can be spatially placed such that one or more of the parties is able to more easily distinguish the different parties.

FIG. 7 is a diagram of a multi-party conference system 700 according to one embodiment of the invention. The multi-party conference system 700 includes a wireless network 702 and a wired network 704. The multi-party conference system 700 also includes a plurality of portable electronic devices, including portable communication device 706 (referred to as device A), portable communication device 708 (referred to as device B) and portable communication device 710 (referred to as device C). These portable communication devices 706-710 couple to the wireless network 702. Additionally, the multi-party conference system 700 includes a stationary communication device 712 (referred to as device D). The stationary communication device 712 can, for example, pertain to a desktop computer or a telephone. Typically, the communication device 712 would couple to the wired network 704 over a wired link 714. However, the link 714 could alternatively include a wireless link.

From the perspective of the portable communication device 706 (device A), the multi-party conference system 700 is further described. In this embodiment, the portable communication device 706 includes a headset 716 that couples (wirelessly or wired) to the portable communication device 706. Here, the portable communication device 706 is assumed to be participating in a multi-party conference call with the users of the portable communication devices 708 and 710 as well as the stationary communication device 712. The wired network 704 and/or the wireless network 702 can provide a central office and switching devices needed to have the users of these devices participate in a multi-party call.

According to one aspect of the invention, the user of the device A wears the headset 716 while participating in the multi-party call. That is, the user of the device A 706 hears each of the other participants of the call through the headset 716. Here, it should be noted that the headset 716 includes a left speaker as well as a right speaker. To assist the user of the device A 706 in determining and distinguishing the different participants in the multi-party call, directional audio processing can be utilized so that the different sources of audio for the call can be directionally placed in a particular location with respect to the headset 716. As a result, the user of the device A 706 hears the other participants in the multi-party call as sound sources originating from different locations.

Although the invention works well for a user wearing a headset, in other embodiments the user hears the audio from other two-speaker apparatuses. In one implementation, the two speakers are provided as a pair of earphones. In another implementation, the two speakers are provided as a pair of speakers adjacent to or embedded in a computer or a computer peripheral.

In one embodiment, the form factor for the portable communication devices 706-710 can be hand-held (or palm-sized) or pocket-sized devices. In one embodiment, the form factor of the portable communication devices is hand-held or smaller. The portable communication devices may, for example, be small and lightweight enough to be carried in one hand, worn, or placed in a pocket (of a user's clothing). Although the form factor is generally small and hand-held (or palm-sized), the configuration of the device can vary widely.

FIG. 8A is a flow diagram of a spatial conference process 800 according to one embodiment of the invention. The spatial conference process 800 can, for example, be performed by an electronic device, such as any of the communication devices 706-712 illustrated in FIG. 7. Alternatively, the spatial conference process 800 can be performed by a central computing device residing in or coupled to a network, such as the wireless network 702 or the wired network 704.

The spatial conference process 800 begins with a decision 802 that determines whether a multi-party call exists. Here, the spatial conference process 800 is provided and described with respect to a particular electronic device, such as the portable electronic device 706 illustrated in FIG. 7. When the decision 802 determines that a multi-party call is not present, the spatial conference process 800 is not further performed. On the other hand, when the decision 802 determines that a multi-party call is present, the spatial conference process 800 continues. In other words, the spatial conference process 800 can be deemed invoked when a multi-party call is present.

When the decision 802 determines that a multi-party call is present, participants are assigned 804 to virtual positions. A decision 806 then determines whether call audio is being received. When the decision 806 determines that call audio is being received, the participant associated with the call audio is identified 808. The call audio can then be adapted 810 based on the virtual position of the identified participant. The adapted call audio is then output 812.

Following the block 812, as well as directly following the decision 806 when call audio is not being received, a decision 814 determines whether the call has concluded. When the decision 814 determines that the call has not yet concluded, the spatial conference process 800 returns to repeat the decision 806 and subsequent blocks. On the other hand, when the decision 814 determines that a call has concluded, the spatial conference process 800 ends.
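
The control flow of the spatial conference process 800 can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the process as implemented in any embodiment: the function name `run_spatial_conference` is hypothetical, a multi-party call is assumed to mean at least two remote participants, incoming audio is modeled as an iterable of pre-tagged `(sender, audio)` frames with `None` standing for "no call audio received", exhaustion of the iterable stands for the call concluding, and `adapt` and `output` are caller-supplied callables.

```python
def run_spatial_conference(participants, frames, adapt, output):
    """Walk the decisions and blocks 802-814; return the number of frames output."""
    # Decision 802: is this a multi-party call? (assumed: two or more remote parties)
    if len(participants) < 2:
        return 0
    # Block 804: assign each participant a virtual position (here, a slot index).
    positions = {name: slot for slot, name in enumerate(participants)}
    emitted = 0
    # Decision 814: the loop ends when the frame source is exhausted (call concluded).
    for frame in frames:
        # Decision 806: is call audio being received?
        if frame is None:
            continue
        # Block 808: identify the participant (frames are pre-tagged in this sketch).
        sender, audio = frame
        # Blocks 810 and 812: adapt the audio to the virtual position, then output it.
        output(adapt(audio, positions[sender]))
        emitted += 1
    return emitted
```

In this sketch the identification step is trivial because each frame already names its sender; a real system would have to derive the sender from the incoming audio or its transport.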

FIG. 8B is a flow diagram of a spatial conference process 850 according to one embodiment of the invention. The spatial conference process 850 can, for example, be performed by an electronic device, such as any of the communication devices 706-712 illustrated in FIG. 7. Alternatively, the spatial conference process 850 can be performed by a central computing device residing in or coupled to a network, such as the wireless network 702 or the wired network 704.

The spatial conference process 850 begins with a decision 852 that determines whether a multi-party call exists. Here, the spatial conference process 850 is provided and described with respect to a particular electronic device, such as the portable electronic device 706 illustrated in FIG. 7. When the decision 852 determines that a multi-party call is not present, the spatial conference process 850 is not further performed. On the other hand, when the decision 852 determines that a multi-party call is present, the spatial conference process 850 continues. In other words, the spatial conference process 850 can be deemed invoked when a multi-party call is present.

When the decision 852 determines that a multi-party call is present, participants are initially assigned 854 to default positions. Here, the default positions can be assigned 854 in a variety of different ways. In one implementation, the assignment to the default positions is automatic. In one implementation, the participants can be assigned 854 to a default position based on their geographic location relative to the location of the host party, which refers to the user of the portable electronic device 706. Alternatively, the default position could be assigned 854 based on the order in which the participants joined the multi-party call.
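
As an illustration of the join-order alternative, default positions can be handed out in sequence from a fixed list. The sketch below is hypothetical: the function name `assign_default_positions` and the portion names (borrowed from the top, left, right and bottom portions of FIG. 10B) and their ordering are assumptions, not something the description prescribes.

```python
# Assumed ordering of portions; the names follow FIG. 10B.
PORTIONS = ["top", "left", "right", "bottom"]


def assign_default_positions(join_order, portions=PORTIONS):
    """Assign each participant, in the order they joined, to the next free portion."""
    if len(join_order) > len(portions):
        raise ValueError("more participants than available portions")
    return dict(zip(join_order, portions))
```

With three participants joining in the order P_B, P_C, P_D, this produces the arrangement top, left, right and leaves the bottom portion empty.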

Next, a participant position screen is displayed 856. The participant position screen can enable a user (such as the user of the portable communication device 706) to alter the position of one or more of the participants to the multi-party call. Here, the participant position screen is displayed 856 such that a user of the portable communication device can manipulate or otherwise cause one or more of the positions associated with the participants to be changed. In doing so, the user, in one embodiment, can cause the physical movement of a representation of a participant on the participant position screen. Here, a decision 858 determines whether a reposition request has been made. When the decision 858 determines that a reposition request has been made, the associated participant is moved 860 to the specified position. Typically, the user of the portable communication device would be the person that moves 860 a representation of the associated participant to the specified position. In response to the movement 860, the participant position screen is refreshed 862. In one implementation, the refreshing 862 is provided as the representation of the associated participant is moved 860.

Following the block 862, or directly following the decision 858 when a reposition request has not been made, a decision 864 determines whether the multi-party call has concluded. When the decision 864 determines that the multi-party call has not been concluded, the spatial conference process 850 returns to repeat the decision 858 and subsequent blocks so that repositioning can be achieved if desired. Alternatively, when the decision 864 determines that the multi-party call has been concluded, the spatial conference process 850 ends.
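
The move 860 performed in response to a reposition request can be sketched as an update to a participant-to-position mapping. The description does not say what happens when the requested position is already occupied; the sketch below assumes, purely for illustration, that the two participants swap positions. The function name `reposition` and the position names are hypothetical.

```python
def reposition(layout, participant, target):
    """Return a new layout with the participant moved to the target position."""
    if participant not in layout:
        raise KeyError(participant)
    # Find another participant already at the target position, if any.
    occupant = next(
        (p for p, pos in layout.items() if pos == target and p != participant),
        None,
    )
    new_layout = dict(layout)
    if occupant is not None:
        # Assumption: an occupied target causes the two participants to swap.
        new_layout[occupant] = layout[participant]
    new_layout[participant] = target
    return new_layout
```

Returning a new mapping rather than mutating the old one makes it straightforward to refresh 862 the screen from the result while keeping the previous arrangement available.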

FIG. 9 is a diagram of a virtual space 900 for a multi-party conference call according to one embodiment of the invention. The virtual space 900 is provided with reference to a headset 902. In this example, the multi-party call is between four participants. The host participant can be deemed associated with device A and the headset 902 coupled thereto. The other participants are associated with devices B, C and D. According to one arrangement for a multi-party call, the virtual space 900 illustrates that the device B is placed at a virtual position 904, the device C is placed at a virtual position 906, and the device D is placed at a virtual position 908. Consequently, the user of the device A (and thus the headset 902) while participating in the multi-party conference hears the user of the device B as originating from the virtual position 904. In addition, the user of the device A (and thus the headset 902) would hear the audio provided by the device C as originating from the virtual position 906. Still further, the user of the headset 902 hears the user of the device D as originating from the virtual position 908. Hence, the user of the device A (and thus the headset 902) would hear the audio provided by the device B as originating from one side (e.g., left side). Similarly, the user of the device A (and thus the headset 902) would hear the audio provided by the device C as originating from an opposite side (e.g., right side). In addition, the user of the device A (and thus the headset 902) would hear the audio provided by the device D as originating from a forward direction. Although the virtual space 900 is provided with reference to a headset 902, the virtual space can also be provided for other two-speaker arrangements.
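
A very simple way to make a mono source appear to come from the left, the right, or straight ahead on two speakers is constant-power amplitude panning. The sketch below uses that technique as a stand-in; the description does not specify how the directional audio processing is performed, and more realistic spatialization (for example, filtering with head-related transfer functions) is outside this sketch. The names `pan_gains` and `spatialize` and the azimuth convention (-90 degrees for full left, 0 for forward, +90 for full right) are assumptions.

```python
import math


def pan_gains(azimuth_deg):
    """Constant-power (left_gain, right_gain) for an azimuth in [-90, 90] degrees."""
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between -90 and 90 degrees")
    # Map [-90, 90] degrees onto a pan angle in [0, pi/2].
    theta = (azimuth_deg + 90.0) / 180.0 * (math.pi / 2.0)
    # cos^2 + sin^2 = 1, so total power is the same at every position.
    return (math.cos(theta), math.sin(theta))


def spatialize(mono, azimuth_deg):
    """Turn mono samples into (left, right) pairs placed at the given azimuth."""
    left_gain, right_gain = pan_gains(azimuth_deg)
    return [(left_gain * s, right_gain * s) for s in mono]
```

Under this convention, the arrangement of FIG. 9 could be approximated by placing device B at -90 degrees, device C at +90 degrees, and device D at 0 degrees.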

FIG. 10A is an exemplary representation of a conference call screen 1000 according to one embodiment of the invention. The conference call screen 1000 indicates that participants P_(B), P_(C) and P_(D) are participating in the multi-party conference call. The conference call screen 1000 can be associated with and presented on a communication device, such as the portable communication device 706 illustrated in FIG. 7.

FIG. 10B is a diagram of an exemplary representation of a multi-party participant position screen 1020 according to one embodiment of the invention. The multi-party participant position screen 1020 illustrates positioning of visual representations of the participants to a multi-party call to locations of the multi-party participant position screen 1020. For example, the participant P_(B) is placed in a top portion 1022, the participant P_(C) is placed at a left portion 1024, and the participant P_(D) is placed at a right portion 1026. The bottom portion 1028 does not include any participant in this example. The participant position screen 1020 can be associated with and presented on a communication device, such as the portable communication device 706 illustrated in FIG. 7. A user can also be permitted to interact with the portable communication device so as to cause the visual representations of one or more of the participants to move to a different portion. For example, the user can provide user input so that the participant P_(B) is moved from the top portion 1022 to the bottom portion 1028.

FIG. 10C is a diagram of a multi-party participant position screen 1040 including participant information according to one embodiment of the invention. The participant position screen 1040 illustrates the same portions 1022-1028 as in the multi-party participant position screen 1020 illustrated in FIG. 10B. In FIG. 10C, the multi-party participant position screen 1040 displays information concerning each of the participants to the multi-party call. As an example, the information on the participants can include: name, company, location and type of communication device. For example, the location can pertain to the company address, their home address or their actual position. Their actual position, for example, can be acquired by a Global Positioning System (GPS) device associated with the participant. The type of communication device being utilized by the participant can also be denoted, such as cell phone, work phone, home phone, work computer, etc. Beyond the information displayed in the portions 1022-1026 as shown in FIG. 10C, the portions 1022-1026 can also display the visual representations of the participants, similar to the participant position screen 1020 of FIG. 10B.

In the event that there are more than four participants, a larger number of portions can be used. FIG. 10D is a diagram of another exemplary representation of a multi-party participant position screen 1060 according to one embodiment of the invention. The multi-party participant position screen 1060 provides distinct portions 1062-1076 that can be used to spatially distinguish up to eight different participants. Visual representations and/or information can be displayed in these portions 1062-1076. A user can also be permitted to interact with the portable communication device so as to cause the visual representations of one or more of the participants to move to a different portion.
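
If the portions of such a screen are taken to correspond to directions around the listener, one possible mapping spaces them evenly around a circle. This is an assumption made only for illustration; the description does not state how the portions 1062-1076 are arranged or which direction each represents. The function name `region_azimuths` is hypothetical.

```python
def region_azimuths(n_regions):
    """Return evenly spaced azimuths in degrees, one per region, starting at 0."""
    if n_regions <= 0:
        raise ValueError("n_regions must be positive")
    step = 360.0 / n_regions
    return [i * step for i in range(n_regions)]
```

For eight regions this gives directions 45 degrees apart, while four regions give the front, one side, the back, and the other side.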

As discussed above, incoming audio from a participant is adapted so that when output to speakers associated with an electronic device, the audio sounds as if it originates from a particular direction. The particular direction is from a virtual position. With multiple participants, different participants are associated with different virtual positions and thus different participants have their audio appear to originate from different directions.

In one embodiment, the electronic device or central computing device can automatically identify the different participants and appropriately adapt their audio so as to appear to originate from a corresponding virtual location. In doing so, the electronic device or the central computing device can operate to distinguish audio from the different participants through a variety of means. In one implementation, the audio from a particular participant can be distinguished using a network address associated with a digital transmission of the audio. In another implementation, voice recognition technology can be utilized to distinguish the different participants. For example, each participant can provide a sample of their voice to the system, and the system can thereafter match incoming audio with one of the participants using the voice samples. In still another implementation, a unique code can be used by each of the participants and transmitted with the audio. The unique code can be on a separate channel (e.g., back channel or control channel). Alternatively, the unique code can be sent as audio in a coded manner or in a frequency band beyond the range of a user's hearing.
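
Two of these identification approaches, the unique code and the network address, reduce to table lookups. The sketch below is a hypothetical illustration: the packet layout (a dictionary with optional `code` and `src_addr` fields), the function name `identify_participant`, and the choice to try the unique code before the network address are all assumptions. Voice recognition is not covered.

```python
def identify_participant(packet, by_address, by_code):
    """Return the participant for a packet, or None if it cannot be identified.

    by_address maps network addresses to participants;
    by_code maps unique codes to participants.
    """
    # Prefer the unique code when one accompanies the audio (assumed priority).
    code = packet.get("code")
    if code is not None and code in by_code:
        return by_code[code]
    # Otherwise fall back to the network address of the transmission.
    return by_address.get(packet.get("src_addr"))
```

The identified participant can then be used to look up the virtual position to which the incoming audio is adapted.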

The various aspects, embodiments, implementations or features of the invention can be used separately or in any combination.

The invention can be implemented by software, hardware or a combination of hardware and software. The invention can also be embodied as computer readable code on a computer readable medium. The computer readable medium is any data storage device that can store data which can thereafter be read by a computer system. Examples of the computer readable medium include read-only memory, random-access memory, CD-ROMs, DVDs, memory cards, magnetic tape, optical data storage devices, and carrier waves. The computer readable medium can also be distributed over network-coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.

The advantages of the invention are numerous. Different aspects, embodiments or implementations may yield one or more of the following advantages. One advantage of the invention is that a user of an electronic device, even a portable electronic device, can receive media playback while participating in a voice call. Another advantage of the invention is that audio can be mixed and transmitted along with audio for a voice call. Still another advantage of the invention is that different virtual spatial locations can be associated with different participants of a multi-party call.

The many features and advantages of the present invention are apparent from the written description. Further, since numerous modifications and changes will readily occur to those skilled in the art, the invention should not be limited to the exact construction and operation as illustrated and described. Hence, all suitable modifications and equivalents may be resorted to as falling within the scope of the invention.

1. A graphical user interface for use in managing virtual spatial locations for a plurality of participants to a multi-party call, the graphical user interface comprising: a plurality of visually distinct regions placed in different positions in a display window; a visual indication for each of the participants of the multi-party call, wherein the visual indication is movable to any of the visually distinct regions via navigation input through the graphical user interface, thereby assigning the corresponding participant to a particular visually distinct region; and wherein audio associated with the participants is spatially adapted to originate from a location corresponding to the particular visually distinct region to which the visual indication for each of the participants is assigned.
2. The graphical user interface of claim 1, wherein each participant has a visual indication assigned to a different visually distinct region.
3. The graphical user interface of claim 1, further comprising: a visual indication for audio playback of a media item, wherein the visual indication for audio playback of a media item is assigned to a different visually distinct region than any of the participants of the multi-party call.
4. The graphical user interface of claim 3, wherein audio associated with the audio playback of the media item is spatially adapted to originate from a location corresponding to the particular visually distinct region to which the visual indication for the audio playback of the media item is assigned.
5. The graphical user interface of claim 1, wherein the navigation input through the graphical user interface is received via user input from a navigation device.
6. A method for spatially distinguishing each of a plurality of participants in a multi-party call on a communication device having associated therewith an audio output device capable of projecting audio so that it sounds as if it is originating from various positions surrounding the communication device, the method comprising: receiving user input indicating an assignment of a visually distinct region in a display window of a graphical user interface to one of the participants in the multi-party call; providing a visual indication in the graphical user interface of a correspondence between the visually distinct region and the one of the participants; and based on the assignment of the visually distinct region to one of the participants in the multi-party call, spatially adapting audio from the one of the participants to originate from a location corresponding to the visually distinct region.
7. The method of claim 6, wherein the communication device has at least two speakers.
8. The method of claim 6, further comprising: assigning an audio playback of a media item to an unassigned visually distinct region so as to distinguish between call audio and the audio playback; spatially adapting the audio playback to originate from a location corresponding to the visually distinct region assigned to the audio playback; and concurrently presenting the spatially adapted audio from the multi-party call and the spatially adapted audio from the media item on the at least two speakers so that they appear to originate from different spatial locations.
9. The method of claim 6, wherein the communication device is a mobile telephone.
10. The method of claim 6, further comprising: providing a blend control, wherein a user may control the blend control to alter the mixture of audio from audio playback of the media item and audio from the multi-party call.
11. A portable communication device comprising: at least two speakers available for audio output; a communication subsystem that supports a multi-party call, the multi-party call being between a user of the portable communication device and a plurality of other participants; a graphical user interface including: a plurality of visually distinct regions placed in different positions in a display window; and a visual indication for at least one of the participants of the multi-party call, wherein the visual indication is movable to any of the visually distinct regions via navigation input through the graphical user interface, thereby assigning the at least one participant to a particular visually distinct region; and an audio manager operatively coupled to the communication subsystem, the graphical user interface, and the at least two speakers and configured to spatially adapt audio associated with the at least one of the participants to originate from a location corresponding to the particular visually distinct region to which the visual indication for the at least one of the participants is assigned.
12. The portable communication device of claim 11, further comprising: an audio playback subsystem configured to play back one or more stored media items.
13. The portable communication device of claim 12, wherein the audio manager is further configured to determine whether audio playback of the one or more stored media items is to be provided while engaging in a call, and to direct audio output so that a user of the portable communication device can hear not only the call but also the audio playback of the one or more stored media items when it is determined that the audio playback is provided while engaging in the call.
14. The portable communication device of claim 13, wherein the audio playback of the one or more stored media items is spatially adapted to originate from a location not associated with any of the participants of the multi-party call.
15. The portable communication device of claim 13, wherein the at least two speakers are associated with a handset that operatively connects to said portable communication device.
16. A device capable of managing a multi-party call and having a display, comprising: means for providing a plurality of visually distinct regions placed in different positions in a display window of the display; means for providing a visual indication for at least one of the participants of the multi-party call, wherein the visual indication is movable to any of the visually distinct regions via navigation input through the graphical user interface, thereby assigning the at least one participant to a particular visually distinct region; and means for spatially adapting audio for each of the participants of the multi-party call to match the visually distinct region assigned to each of the participants.
17. The device of claim 16, further comprising: means for storing one or more media items; and means for playing the one or more media items concurrently with playing the spatially adapted audio for each of the participants of the multi-party call.
18. The device of claim 17, further comprising: means for spatially adapting the one or more media items to match an unassigned visually distinct region.
19. The device of claim 17, wherein the one or more media items are videos.
20. The device of claim 16, wherein the multi-party call is a video conference call.